Fix byte/str handling for python3 #2

Open
wants to merge 6 commits into
base: main

ndevenish
Member

@ndevenish ndevenish commented May 14, 2021

These are the patches from yayahjb/cbflib#19.

Problem: All char * fields, both input[1] and output, are mapped to str. This works on Python 2, but on Python 3 it means that binary data comes back encoded in a str and has to be converted back with encode('utf-8', errors='surrogateescape') (see SWIG docs). This can be worked around by defining SWIG_PYTHON_STRICT_BYTE_CHAR in the build, but then all char * fields become bytes, meaning that you need to encode any strings that you pass into pycbf (see e.g. cctbx/dxtbx@44b6d72).
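To make the Python 3 problem concrete, here is a minimal pure-Python sketch (no pycbf involved). It assumes the default SWIG behaviour described above, where a returned char * is decoded as UTF-8 with the surrogateescape error handler. The helper names `swig_style_decode` and `recover_bytes` are hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for what a default SWIG wrapper does to a returned
# char * on Python 3: decode as UTF-8, mapping undecodable bytes to lone
# surrogate code points instead of raising.
def swig_style_decode(raw: bytes) -> str:
    return raw.decode("utf-8", errors="surrogateescape")


# The extra step every caller then needs to get the original bytes back.
def recover_bytes(text: str) -> bytes:
    return text.encode("utf-8", errors="surrogateescape")


raw = b"\x00\x01\xff\xfebinary payload"  # not valid UTF-8
as_str = swig_style_decode(raw)

# The round trip is lossless, but only with the matching error handler.
assert recover_bytes(as_str) == raw

# A plain encode() fails, because the str contains lone surrogates.
try:
    as_str.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

The data survives, but every caller has to know about the surrogateescape convention, which is the burden this PR aims to remove for data-returning functions.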

This change fixes that. On Python 3, functions that return data return it as a Python bytes object, and functions that accept data accept a Python bytes object. Everything else accepts str.

[1] Input field mapping is mostly solved by 647ffcb, which means that anything that accepts an explicit string argument will accept both str and bytes. However, we don't want data fields to accept strings, so input handling is still important.
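The intended input policy can be sketched in plain Python as follows. This is only an illustration of the rule described here, not the actual typemap code, which lives in the SWIG interface. The function names `coerce_string_arg` and `require_data_arg` are hypothetical, and encoding str as UTF-8 is an assumption.

```python
def coerce_string_arg(value):
    """String-like arguments: accept str or bytes (assumed UTF-8 for str)."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


def require_data_arg(value):
    """Binary data arguments: accept bytes only, so str is rejected early."""
    if isinstance(value, bytes):
        return value
    raise TypeError(f"expected bytes, got {type(value).__name__}")


# String arguments accept either type.
assert coerce_string_arg("column_name") == b"column_name"
assert coerce_string_arg(b"column_name") == b"column_name"

# Data arguments pass bytes through untouched and refuse str.
assert require_data_arg(b"\xff\xfe\x00") == b"\xff\xfe\x00"
try:
    require_data_arg("not binary")
    rejected = False
except TypeError:
    rejected = True
```

Rejecting str for data fields keeps a silently re-encoded string from ever being written as binary data.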

  • Need to check that this strictly doesn't conflict with the byte/string dual handling from 647ffcb
